# End-to-end multimodal
Qwen2.5 Omni 7B AWQ
Other
Qwen2.5-Omni is an end-to-end multimodal model capable of perceiving multiple modalities including text, images, audio, and video, while generating text and natural speech responses in a streaming manner.
Multimodal Fusion
Transformers English

Qwen
77
8
Qwen2.5 Omni 3B
Other
Qwen2.5-Omni is an end-to-end multimodal model capable of perceiving various modalities including text, images, audio, and video, while synchronously generating text and natural speech responses in a streaming manner.
Multimodal Fusion
Transformers English

Qwen
48.07k
219